In this section, we load our HELOC (home equity line of credit) dataset.
HELOC DATABASE
| Property | Value |
|---|---|
| Name | Datos1 |
| Number of rows | 4004 |
| Number of columns | 24 |
| Column type frequency: | |
| character | 1 |
| numeric | 23 |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| RiskPerformance | 0 | 1 | 3 | 4 | 0 | 2 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| ExternalRiskEstimate | 0 | 1 | 75.64 | 8.66 | 48 | 70 | 76 | 82 | 94 | ▁▃▆▇▃ |
| MSinceOldestTradeOpen | 0 | 1 | 188.08 | 91.79 | -8 | 130 | 179 | 243 | 446 | ▃▇▇▃▁ |
| MSinceMostRecentTradeOpen | 0 | 1 | 6.84 | 5.24 | 0 | 3 | 5 | 10 | 23 | ▇▆▂▂▁ |
| AverageMInFile | 0 | 1 | 73.76 | 27.75 | 4 | 56 | 73 | 91 | 159 | ▂▆▇▃▁ |
| NumSatisfactoryTrades | 0 | 1 | 21.49 | 9.22 | 1 | 15 | 21 | 27 | 49 | ▂▇▆▃▁ |
| NumTrades60Ever2DerogPubRec | 0 | 1 | 0.10 | 0.34 | 0 | 0 | 0 | 0 | 2 | ▇▁▁▁▁ |
| NumTrades90Ever2DerogPubRec | 0 | 1 | 0.00 | 0.00 | 0 | 0 | 0 | 0 | 0 | ▁▁▇▁▁ |
| PercentTradesNeverDelq | 0 | 1 | 95.84 | 6.87 | 68 | 94 | 100 | 100 | 100 | ▁▁▁▁▇ |
| MSinceMostRecentDelq | 0 | 1 | 1.02 | 12.97 | -8 | -7 | -7 | 6 | 45 | ▇▂▁▁▁ |
| MaxDelq2PublicRecLast12M | 0 | 1 | 6.20 | 1.17 | 3 | 6 | 7 | 7 | 9 | ▂▁▃▇▁ |
| MaxDelqEver | 0 | 1 | 7.12 | 1.12 | 2 | 6 | 8 | 8 | 8 | ▁▁▁▃▇ |
| NumTotalTrades | 0 | 1 | 22.44 | 10.58 | 0 | 15 | 22 | 29 | 54 | ▂▇▇▃▁ |
| NumTradesOpeninLast12M | 0 | 1 | 1.95 | 1.59 | 0 | 1 | 2 | 3 | 7 | ▇▃▅▁▁ |
| PercentInstallTrades | 0 | 1 | 34.53 | 15.01 | 2 | 23 | 33 | 44 | 80 | ▂▇▆▃▁ |
| MSinceMostRecentInqexcl7days | 0 | 1 | -0.27 | 4.73 | -8 | 0 | 0 | 1 | 13 | ▅▇▃▁▁ |
| NumInqLast6M | 0 | 1 | 1.18 | 1.29 | 0 | 0 | 1 | 2 | 5 | ▇▂▁▁▁ |
| NumInqLast6Mexcl7days | 0 | 1 | 1.13 | 1.26 | 0 | 0 | 1 | 2 | 5 | ▇▂▁▁▁ |
| NetFractionRevolvingBurden | 0 | 1 | 28.65 | 25.19 | -8 | 7 | 22 | 44 | 120 | ▇▆▃▂▁ |
| NetFractionInstallBurden | 0 | 1 | 46.80 | 40.46 | -8 | -8 | 59 | 83 | 196 | ▇▆▇▁▁ |
| NumRevolvingTradesWBalance | 0 | 1 | 3.50 | 1.97 | 0 | 2 | 3 | 5 | 9 | ▃▇▅▂▁ |
| NumInstallTradesWBalance | 0 | 1 | 2.31 | 1.21 | 1 | 1 | 2 | 3 | 6 | ▇▃▁▁▁ |
| NumBank2NatlTradesWHighUtilization | 0 | 1 | 0.61 | 0.74 | 0 | 0 | 0 | 1 | 2 | ▇▁▅▁▂ |
| PercentTradesWBalance | 0 | 1 | 63.31 | 20.40 | 7 | 50 | 63 | 78 | 100 | ▁▅▇▇▅ |
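The report does not show the code behind the tables above. As a rough illustration of what the `n_missing`, `complete_rate`, `mean`, and `sd` columns measure, here is a minimal Python sketch using only the standard library. The function name `skim_numeric` and the toy column are hypothetical, and `None` is assumed to mark a missing entry.

```python
import statistics

def skim_numeric(values):
    """Summarise one numeric column; None marks a missing entry (assumption)."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    return {
        "n_missing": n_missing,
        # Share of entries that are not missing.
        "complete_rate": len(present) / len(values),
        "mean": statistics.fmean(present),
        # Sample standard deviation (n - 1 in the denominator).
        "sd": statistics.stdev(present),
    }

# Hypothetical toy column, not taken from the HELOC data.
column = [2, 4, None, 4, 6]
row = skim_numeric(column)
```

In the tables above every variable has `n_missing = 0` and `complete_rate = 1`, so no column would lose entries in such a computation.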
Next, we obtain a statistical summary of all variables.
| RiskPerformance | ExternalRiskEstimate | MSinceOldestTradeOpen | MSinceMostRecentTradeOpen | AverageMInFile | NumSatisfactoryTrades | NumTrades60Ever2DerogPubRec | NumTrades90Ever2DerogPubRec | PercentTradesNeverDelq | MSinceMostRecentDelq | MaxDelq2PublicRecLast12M | MaxDelqEver | NumTotalTrades | NumTradesOpeninLast12M | PercentInstallTrades | MSinceMostRecentInqexcl7days | NumInqLast6M | NumInqLast6Mexcl7days | NetFractionRevolvingBurden | NetFractionInstallBurden | NumRevolvingTradesWBalance | NumInstallTradesWBalance | NumBank2NatlTradesWHighUtilization | PercentTradesWBalance | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Length:4004 | Min. :48.00 | Min. : -8.0 | Min. : 0.000 | Min. : 4.00 | Min. : 1.00 | Min. :0.0000 | Min. :0 | Min. : 68.00 | Min. :-8.000 | Length:4004 | Length:4004 | Min. : 0.00 | Min. :0.000 | Min. : 2.00 | Min. :-8.000 | Min. :0.000 | Min. :0.000 | Min. : -8.00 | Min. : -8.0 | Min. : -8.0 | Min. :1.000 | Min. :0.0000 | Min. : 7.00 | |
| Class :character | 1st Qu.:70.00 | 1st Qu.:130.0 | 1st Qu.: 3.000 | 1st Qu.: 56.00 | 1st Qu.:15.00 | 1st Qu.:0.0000 | 1st Qu.:0 | 1st Qu.: 94.00 | 1st Qu.:-7.000 | Class :character | Class :character | 1st Qu.:15.00 | 1st Qu.:1.000 | 1st Qu.:23.00 | 1st Qu.: 0.000 | 1st Qu.:0.000 | 1st Qu.:0.000 | 1st Qu.: 7.00 | 1st Qu.: -8.0 | 1st Qu.: -8.0 | 1st Qu.:1.000 | 1st Qu.:0.0000 | 1st Qu.: 50.00 | |
| Mode :character | Median :76.00 | Median :179.0 | Median : 5.000 | Median : 73.00 | Median :21.00 | Median :0.0000 | Median :0 | Median :100.00 | Median :-7.000 | Mode :character | Mode :character | Median :22.00 | Median :2.000 | Median :33.00 | Median : 0.000 | Median :1.000 | Median :1.000 | Median : 22.00 | Median : 59.0 | Median : 59.0 | Median :2.000 | Median :0.0000 | Median : 63.00 | |
| NA | Mean :75.64 | Mean :188.1 | Mean : 6.839 | Mean : 73.76 | Mean :21.49 | Mean :0.1001 | Mean :0 | Mean : 95.84 | Mean : 1.021 | NA | NA | Mean :22.44 | Mean :1.949 | Mean :34.53 | Mean :-0.274 | Mean :1.181 | Mean :1.131 | Mean : 28.65 | Mean : 46.8 | Mean : 46.8 | Mean :2.309 | Mean :0.6116 | Mean : 63.31 | |
| NA | 3rd Qu.:82.00 | 3rd Qu.:243.0 | 3rd Qu.:10.000 | 3rd Qu.: 91.00 | 3rd Qu.:27.00 | 3rd Qu.:0.0000 | 3rd Qu.:0 | 3rd Qu.:100.00 | 3rd Qu.: 6.000 | NA | NA | 3rd Qu.:29.00 | 3rd Qu.:3.000 | 3rd Qu.:44.00 | 3rd Qu.: 1.000 | 3rd Qu.:2.000 | 3rd Qu.:2.000 | 3rd Qu.: 44.00 | 3rd Qu.: 83.0 | 3rd Qu.: 83.0 | 3rd Qu.:3.000 | 3rd Qu.:1.0000 | 3rd Qu.: 78.00 | |
| NA | Max. :94.00 | Max. :446.0 | Max. :23.000 | Max. :159.00 | Max. :49.00 | Max. :2.0000 | Max. :0 | Max. :100.00 | Max. :45.000 | NA | NA | Max. :54.00 | Max. :7.000 | Max. :80.00 | Max. :13.000 | Max. :5.000 | Max. :5.000 | Max. :120.00 | Max. :196.0 | Max. :196.0 | Max. :6.000 | Max. :2.0000 | Max. :100.00 |
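Each numeric column of the summary above reports a minimum, first quartile, median, mean, third quartile, and maximum. The following Python sketch shows one way to compute the same six numbers with the standard library. It is an illustration rather than the code used for the report; `six_number_summary` and the toy values are hypothetical, and the `"inclusive"` quantile method is an assumption about which interpolation rule to use.

```python
import statistics

def six_number_summary(values):
    """Return min, quartiles, mean and max for one numeric column."""
    # "inclusive" treats the data as containing its own min and max,
    # interpolating linearly between order statistics.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(values),
        "q3": q3,
        "max": max(values),
    }

# Hypothetical toy values, not taken from the HELOC data.
sample = [48, 70, 76, 82, 94]
summary = six_number_summary(sample)
```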
A histogram (bar chart of counts) is drawn for the categorical variable, RiskPerformance:
Histograms are drawn for all of the numeric variables:
Here, separate histograms are drawn for every numeric variable, split by the levels of the categorical variable:
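The counting step behind a histogram split by class can be sketched as follows. This is a minimal Python illustration using the standard library, not the plotting code behind the report. The function `grouped_histogram`, the bin width, the toy scores, and the class labels `"Good"` and `"Bad"` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def grouped_histogram(values, groups, bin_width):
    """Count observations per (group, bin); each bin is keyed by its left edge."""
    counts = defaultdict(Counter)
    for value, group in zip(values, groups):
        # Floor division also places negative values in the correct bin.
        left_edge = (value // bin_width) * bin_width
        counts[group][left_edge] += 1
    return {group: dict(sorted(c.items())) for group, c in counts.items()}

# Hypothetical toy data, not taken from the HELOC data.
scores = [52, 58, 63, 67, 71, 74, 78, 83, 88, 91]
labels = ["Bad", "Bad", "Bad", "Good", "Bad", "Good", "Good", "Good", "Good", "Good"]
hist = grouped_histogram(scores, labels, bin_width=10)
```

Plotting one set of bars per key of `hist` then gives one histogram per class on a shared set of bins, which makes the two distributions directly comparable.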
Common statistical tests for checking normality include:
Shapiro-Wilk test
| Variable | P_Value |
|---|---|
| AverageMInFile | 2.106e-10 |
| NumTotalTrades | 7.264e-19 |
| MSinceOldestTradeOpen | 1.489e-19 |
| PercentInstallTrades | 2.487e-21 |
| PercentTradesWBalance | 3.011e-22 |
| NumSatisfactoryTrades | 1.109e-22 |
| ExternalRiskEstimate | 4.986e-26 |
| NetFractionRevolvingBurden | 5.228e-44 |
| NumTradesOpeninLast12M | 4.275e-45 |
| MSinceMostRecentTradeOpen | 1.014e-46 |
| MSinceMostRecentInqexcl7days | 1.997e-49 |
| NumInstallTradesWBalance | 2.900e-50 |
| NetFractionInstallBurden | 5.424e-52 |
| NumRevolvingTradesWBalance | 5.424e-52 |
| NumInqLast6M | 1.042e-54 |
| NumInqLast6Mexcl7days | 2.249e-55 |
| NumBank2NatlTradesWHighUtilization | 4.228e-62 |
| MSinceMostRecentDelq | 1.432e-65 |
| PercentTradesNeverDelq | 2.327e-66 |
| NumTrades60Ever2DerogPubRec | 1.471e-81 |
| NumTrades90Ever2DerogPubRec | NA |
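The p-values in the table can be read against a significance level. The Python sketch below applies the conventional 0.05 threshold to three rows copied from the table. The function `classify` and its wording are hypothetical, and `None` stands in for the `NA` entry. That `NA` presumably arises because NumTrades90Ever2DerogPubRec is constant (all zeros in the summary tables), so the test statistic is undefined.

```python
ALPHA = 0.05  # conventional significance level (assumption for this sketch)

def classify(p_value, alpha=ALPHA):
    """Turn a Shapiro-Wilk p-value into a plain-language decision."""
    if p_value is None:
        # No p-value available, e.g. for a constant column.
        return "not testable"
    if p_value < alpha:
        return "reject normality"
    return "no evidence against normality"

# Three rows copied from the table above; None represents NA.
p_values = {
    "AverageMInFile": 2.106e-10,
    "ExternalRiskEstimate": 4.986e-26,
    "NumTrades90Ever2DerogPubRec": None,
}
decisions = {name: classify(p) for name, p in p_values.items()}
```

Every non-missing p-value in the table is far below 0.05, so this rule rejects normality for all testable variables.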
Visually, histograms, Q-Q plots, and density plots can be used to assess normality.
Shapiro-Wilk test: Checks whether each numeric variable follows a normal distribution. p-values below 0.05 indicate a significant departure from normality.
Histogram with density: Shows the distribution of the variable with an overlaid density curve to help judge normality visually.
Q-Q plot: Compares the quantiles of the variable with the quantiles of a theoretical normal distribution. If the data are normal, the points should fall along the reference line.
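The points of a normal Q-Q plot can be computed as in the following Python sketch, which uses only the standard library. It is one possible construction, not necessarily the one used for the report. The function `qq_points`, the plotting positions `(i + 0.5) / n`, and the toy sample are assumptions, and the reference normal is fitted with the sample mean and standard deviation.

```python
from statistics import NormalDist, fmean, stdev

def qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(values)
    n = len(ordered)
    # Reference normal with the sample's own mean and standard deviation.
    reference = NormalDist(fmean(ordered), stdev(ordered))
    # Plotting positions (i + 0.5) / n stay strictly between 0 and 1.
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical toy sample, not taken from the HELOC data.
points = qq_points([3, 1, 5, 2, 4])
```

Because the reference distribution is fitted to the sample, normal data would place the pairs in `points` close to the line y = x; systematic curvature away from that line indicates skewness or heavy tails.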